Web Spam: a Survey with Vision for the Archivist
Abstract
While Web archive quality is endangered by Web spam, a side effect of the high commercial value of top-ranked search-engine results, Web spam filtering technologies have so far rarely been used by Web archivists. In this paper we make a first attempt to disseminate existing methodology and envision a solution for Web archives to share knowledge and unite efforts in Web spam hunting. We survey the state of the art in Web spam filtering, illustrated by the recent Web spam challenge data sets and techniques, and describe the filtering solution for archives envisioned in the LiWA (Living Web Archives) project.
Similar resources
A Survey on Web Spam and Spam 2.0
In the current scenario, the web is huge, highly distributed, open in nature, and changing rapidly. The open nature of the web is the main reason for its rapid growth, but it has also imposed a challenge on information retrieval. One of the biggest challenges is spam. We focus here on a study of the different forms of web spam and its new variant called spam 2.0, existing detection methods proposed by differe...
Approaches for Web Spam Detection
Spam is a major threat to web security. The web of trust is being abused by spammers through their ever-evolving tactics for personal gain. In fact, there is a long chain of spammers who are running huge business campaigns under the web. Spam causes underutilization of search engine resources and creates dissatisfaction among the web community. Web Security being a prime challenge fo...
A Suitable Method for Classifying Advertising Emails Based on User Profiles
In general, whether a message counts as spam depends on whether it satisfies the client, not on the content of the client’s email. Under this definition, problems arise in the field of marketing and advertising: for example, some advertising emails may be spam for some users and not spam for others. To deal with this problem, many researchers design an anti-s...
Spam 2.0 State of the Art
Spam 2.0 is defined as the propagation of unsolicited, anonymous, mass content to infiltrate legitimate Web 2.0 applications. A fake eye-catching profile in social networking websites, a promotional review, a response to a thread in online forums with unsolicited content, or a manipulated Wiki page are examples of Spam 2.0. In this paper, the authors provide a comprehensive survey of the state-...
A Survey on Web Spam Detection Methods: Taxonomy
Web spam refers to techniques that try to manipulate search engine ranking algorithms in order to raise a web page's position in search engine results. In the best case, spammers encourage viewers to visit their sites, and provide undeserved advertisement gains to the page owner. In the worst case, they use malicious content in their pages and try to install malware on the victim’s machine....